Hardware Support for Data Dependence Speculation in Distributed Shared-Memory Multiprocessors Via Cache-block Reconciliation
Authors
Abstract
Data dependence speculation allows a compiler to relax the constraint of data independence when issuing tasks in parallel, increasing the potential for automatic extraction of parallelism from sequential programs. This paper proposes hardware mechanisms to support a data-dependence speculative distributed shared-memory (DDSM) architecture that enables speculative parallelization of programs with irregular data structures and inherent coarse-grain parallelism. Efficient support for coarse-grain tasks requires large buffers for speculative data; DDSM leverages cache and directory structures to provide large buffers that are managed transparently to applications. The proposed cache and directory extensions provide support for distributed speculative versions of cache blocks, run-time detection of dependence violations, and program-order reconciliation of cache blocks. This paper describes the DDSM architecture and presents a simulation-based evaluation of its performance on five benchmarks chosen from the SPEC95 and Olden suites. The proposed system yields simulated speedups of up to 12.5 in a 16-node configuration for programs with coarse-grain speculative windows (millions of instructions and hundreds of KBytes of speculative data).
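To make the idea of run-time detection of dependence violations concrete, the following is a minimal software sketch, not the hardware protocol the paper proposes. It assumes tasks are numbered in sequential program order and that a violation occurs when an earlier task stores to an address that a later task has already loaded. The class name `ViolationDetector` and its methods are hypothetical, introduced only for illustration; the sketch also ignores the case where the later task wrote the location itself before reading it.

```python
class ViolationDetector:
    """Toy model of run-time dependence checking (illustrative assumption only)."""

    def __init__(self):
        # Maps an address to the set of task IDs that have speculatively loaded it.
        self.readers = {}

    def load(self, task, addr):
        # Record that `task` has read `addr` speculatively.
        self.readers.setdefault(addr, set()).add(task)

    def store(self, task, addr):
        # A store by `task` invalidates any read already performed by a task
        # that comes later in program order; those tasks would need to be
        # squashed and re-executed. Returns their IDs in ascending order.
        return sorted(t for t in self.readers.get(addr, ()) if t > task)


detector = ViolationDetector()
detector.load(3, 0x40)   # task 3 reads address 0x40 early
detector.load(1, 0x40)   # task 1 reads the same address
violated = detector.store(2, 0x40)  # task 2 then writes it
print(violated)
```

In this model only task 3 is reported, because task 1 precedes the writer in program order and was correct to read the old value.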
Similar resources
Hardware for Speculative Run-Time Parallelization in Distributed Shared-Memory Multiprocessors
Run-time parallelization is often the only way to execute the code in parallel when data dependence information is incomplete at compile time. This situation is common in many important applications. Unfortunately, known techniques for run-time parallelization are often computationally expensive or not general enough. To address this problem, we propose new hardware support for efficient run-time...
Toward Large Scale Shared Memory Multiprocessing
We are currently investigating two different approaches to scalable shared memory: Munin, a distributed shared memory (DSM) system implemented entirely in software, and Willow, a true shared memory multiprocessor with extensive hardware support for scalability. Munin allows parallel programs written for shared memory multiprocessors to be executed efficiently on distributed memory multiprocessors. Unlike...
A Comparison of Software and Hardware Synchronization Mechanisms for Distributed Shared Memory Multiprocessors
Efficient synchronization is an essential component of parallel computing. The designers of traditional multiprocessors have included hardware support only for simple operations such as compare-and-swap and load-linked/store-conditional, while high-level synchronization primitives such as locks, barriers, and condition variables have been implemented in software. With the advent of directory-based dis...
Efficient Integration of Compiler-directed Cache Coherence and Data Prefetching
Cache coherence enforcement and memory latency reduction and hiding are very important and challenging problems in the design of large-scale distributed shared-memory (DSM) multiprocessors. We propose an integrated approach to solve these problems through a compiler-directed cache coherence scheme called the Cache Coherence with Data Prefetching (CCDP) scheme. The CCDP scheme enforces cache coh...
Specification-based Verification in a Distributed Shared Memory Simulation Model
The emergence of chip multiprocessors is leading to rapid advances in hardware and software systems to provide distributed shared memory (DSM) programming models, so-called DSM systems. A DSM system provides programming advantages within a scalable and cost-effective hardware solution. This benefit derives from the fact that a DSM system creates a shared-memory abstraction on top of a distribut...